ABSTRACT

We show that the previously introduced concept of 
distance on statistical spaces leads to a straightforward 
definition of differential entropy on these statistical 
spaces. These spaces are characterized by the fact that 
their points can only be localized within a certain volume 
and thus exhibit a feature of fuzziness. This implies that
Riemann integrability of the relevant integrals is no longer
guaranteed. A discussion of the specialization of this
formalism to quantum states concludes the paper.

Keywords: differential entropy, mutual information,
statistical spaces, minimal length, quantum states,
entropic inequalities.



1. INTRODUCTION 

Differential entropy is the entropy of a continuous ran- 
dom variable. It is related to the shortest description 
length and thus similar to the entropy of a discrete ran- 
dom variable. A basic introduction can be found in the 
book of Cover and Thomas [3]. In this paper, we are
interested in the concept of shortest description length.
Indeed, in a recent paper [1], we have investigated the
case of spaces where points are in fact localized within a 
certain volume, i.e. they are statistical in nature. The 
motivation was the existence of a minimal length in phys- 
ical theories. It was possible to introduce a concept of 
distance using Fisher information metric on such spaces. 
In this paper we show that the reasoning leading to the
definition of a distance is analogous to the usual
introduction of differential entropy in information theory. In
our case too, care must be taken with the precise meaning
of minimal distance or shortest description length.

In this work we only present an outline of our method. 
It is structured as follows. In section 2 the basic defi- 
nitions of differential entropy, Kullback-Leibler distance 
and mutual information are recalled. The following section
is devoted to the concept of distance on statistical
spaces as proposed in [1]. Section 4 extends the definitions
presented in section 2 to statistical spaces. In
particular, we show that the concept of distance intro- 
duced in the preceding section leads to a mutual infor- 
mation function analogous to the usual one. This is the 
main and new contribution of this paper. Section 5 of- 
fers a discussion of some specific features and outlines 
some open research tracks. It also contains concluding 
remarks. 



2. BASIC DEFINITIONS 

Definition 1: The differential entropy h(X) of a continuous
random variable X with a density f(x) is defined
as

    h(X) = -\int_S f(x) \log f(x) \, dx,    (1)

where S is the support set of the random variable.
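
As a hedged illustration of Definition 1 in the usual (non-statistical) setting, the sketch below evaluates eq. (1) numerically for a zero-mean Gaussian density and compares the result with the known closed form (1/2) log(2 pi e sigma^2). The function names, the midpoint rule, the integration window and the choice of natural logarithms are assumptions made for this example only; they are not part of the formalism of this paper.

```python
import math

def gaussian_pdf(x, sigma=1.0):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def differential_entropy(pdf, lower, upper, steps=20000):
    """Midpoint-rule estimate of -int f(x) log f(x) dx on [lower, upper], in nats."""
    width = (upper - lower) / steps
    total = 0.0
    for k in range(steps):
        fx = pdf(lower + (k + 0.5) * width)
        if fx > 0.0:  # points where f vanishes contribute nothing (0 log 0 = 0)
            total -= fx * math.log(fx) * width
    return total

sigma = 2.0  # hypothetical value chosen for the example
numeric = differential_entropy(lambda x: gaussian_pdf(x, sigma), -12 * sigma, 12 * sigma)
closed_form = 0.5 * math.log(2.0 * math.pi * math.e * sigma ** 2)
print(numeric, closed_form)
```

The same helper applied to a uniform density on an interval of length a returns log a, which is negative for a < 1; unlike discrete entropy, differential entropy need not be positive.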

As with any definition involving an integral, h(X) is
well defined only if the density function of the random
variable is such that the integral exists. This is related
to the issue concerning the precise meaning of minimal distance.
The next step consists in establishing the relation of
differential entropy to discrete entropy. One then proceeds
to define the joint and conditional differential entropies,
including the entropy of a multivariate distribution.
Finally, the relative entropy and the mutual information
functions are introduced.

Definition 2: The Kullback-Leibler distance or relative
entropy is defined as

    D(f\|g) = \int f \log \frac{f}{g},    (2)

where f and g are two density functions.



Definition 3: The mutual information I(X;Y) between
two random variables with joint density f(x,y) is
defined as

    I(X;Y) = \int f(x,y) \log \frac{f(x,y)}{f(x) f(y)} \, dx \, dy.    (3)

The previous definitions lead to an equation for the
mutual information given by:

    I(X;Y) = D(f(x,y) \| f(x) f(y)).    (4)



An important point is that this provides a link between 
the discrete and continuous cases since the properties of 
the Kullback-Leibler distance and the mutual informa- 
tion are the same in both cases. 
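
The identity (4) can be checked numerically in a simple classical case. The sketch below is a minimal example under stated assumptions: it takes a standard bivariate Gaussian with correlation rho (a hypothetical choice, not taken from this paper), evaluates the double integral of eq. (3) with a midpoint rule, and compares it with the known closed form -(1/2) log(1 - rho^2), in nats. All function names and grid parameters are illustrative.

```python
import math

def joint_pdf(x, y, rho):
    """Standard bivariate Gaussian density with correlation rho."""
    norm = 2.0 * math.pi * math.sqrt(1.0 - rho ** 2)
    quad = (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho ** 2)
    return math.exp(-0.5 * quad) / norm

def marginal_pdf(x):
    """Standard Gaussian density, the marginal of joint_pdf in x or in y."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def mutual_information(rho, half_width=8.0, steps=400):
    """Midpoint-rule estimate of eq. (3), i.e. D(f(x,y) || f(x) f(y)), in nats."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        fx = marginal_pdf(x)
        for j in range(steps):
            y = -half_width + (j + 0.5) * h
            fxy = joint_pdf(x, y, rho)
            if fxy > 0.0:
                total += fxy * math.log(fxy / (fx * marginal_pdf(y))) * h * h
    return total

rho = 0.6  # hypothetical correlation chosen for the example
estimate = mutual_information(rho)
closed_form = -0.5 * math.log(1.0 - rho ** 2)
print(estimate, closed_form)
```

For rho = 0 the joint density factorizes and the estimate vanishes, in line with the equality case of the relative entropy.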



3. DISTANCE ON STATISTICAL SPACES 

We now briefly review the connection between the
Kullback-Leibler distance and the Fisher information
matrix. To do so, we start from a generalized
concept of distance based on the concept of entropy. It is
often useful to introduce a concept of distance between
elements of a more abstract set. For example, one could
ask what the distance is between two distributions,
e.g. the Gaussian and binomial distributions. Entropy
provides a means to define such distances. In information
theory, the Shannon entropy [10] represents the information
content of a message or, from the receiver's point of view,
the uncertainty about the message the sender produced prior
to its reception. It is defined as

    H = -\sum_i p(i) \log p(i),    (5)


where p(i) is the probability of receiving the message i.
The unit used is the bit. The relative entropy can be
used to define a "distance" between two distributions p(i)
and g(i). The Kullback-Leibler distance [6] or relative
entropy is defined as

    D(g\|p) = \sum_i g(i) \log \frac{g(i)}{p(i)},    (6)


where p(i) is the real distribution and g(i) is an assumed 
distribution. Clearly the Kullback-Leibler relative en- 
tropy is not a distance in the usual sense: it satisfies 
the positive definiteness axiom, but not the symmetry or 
the triangle inequality axioms. It is nevertheless useful 
to think of the relative entropy as a distance between 
distributions. 
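
The lack of symmetry is easy to see on a small example. The following sketch, with two hypothetical three-outcome distributions chosen only for illustration, evaluates the Shannon entropy of eq. (5) and the relative entropy of eq. (6) in bits and compares D(g||p) with D(p||g). The function names are assumptions of this example.

```python
import math

def shannon_entropy(p):
    """Shannon entropy of eq. (5), in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0.0)

def relative_entropy(g, p):
    """Relative entropy D(g||p) of eq. (6), in bits.

    Assumes p(i) > 0 wherever g(i) > 0; otherwise the sum diverges.
    """
    return sum(gi * math.log2(gi / pi) for gi, pi in zip(g, p) if gi > 0.0)

p = [0.5, 0.25, 0.25]  # hypothetical "real" distribution
g = [0.8, 0.1, 0.1]    # hypothetical "assumed" distribution

print(shannon_entropy(p))      # entropy of p in bits
print(relative_entropy(g, p))  # D(g||p)
print(relative_entropy(p, g))  # D(p||g), a different number
```

Both relative entropies are positive and D(p||p) vanishes, but the two orderings give different values, which is why the quantity is a "distance" only in a loose sense.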

The Kullback-Leibler distance is relevant to discrete
sets. It can be generalized to the case of continuous sets.
For our purposes, a probability distribution over some
field (or set) X is a distribution $p : X \to \mathbb{R}$ such that

1. $\int_X d^4x \, p(x) = 1$

2. For any finite subset $S \subset X$, $\int_S d^4x \, p(x) > 0$.

We shall consider families of distributions, and parameterize
them by a set of continuous parameters $\theta^i$ that
take values in some open interval $M \subset \mathbb{R}^4$. We use the
notation $p_\theta$ to denote members of the family. For any
fixed $\theta$, $p_\theta : x \mapsto p_\theta(x)$ is a mapping from X to $\mathbb{R}$. We
shall consider the extension of the family of distributions
$F = \{p_\theta \mid \theta \in M\}$ to a manifold $\mathcal{M}$ such that the points
$p \in \mathcal{M}$ are in one-to-one correspondence with the
distributions $p \in F$. The parameters $\theta$ of F can thus be used
as coordinates on $\mathcal{M}$.

The Kullback number is the generalization of the
Kullback-Leibler distance for continuous sets. It is
defined as

    I(q_\theta\|p_\theta) = \int d^4x \, q_\theta(x) \log \frac{q_\theta(x)}{p_\theta(x)}.    (7)



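Let us now study the case of an infinitesimal difference
between $q_\theta(x) = p_{\theta+\epsilon v}(x)$ and $p_\theta(x)$:

    I(p_{\theta+\epsilon v}\|p_\theta) = \int d^4x \, p_{\theta+\epsilon v}(x) \log \frac{p_{\theta+\epsilon v}(x)}{p_\theta(x)}.    (8)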



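Expanding in $\epsilon$ and keeping $\theta$ and v fixed, one finds (see e.g.
[8]):

    I(p_{\theta+\epsilon v}\|p_\theta) = I(\epsilon)\big|_{\epsilon=0} + \epsilon \, I'(\epsilon)\big|_{\epsilon=0} + \frac{1}{2} \epsilon^2 I''(\epsilon)\big|_{\epsilon=0} + O(\epsilon^3),    (9)

where $I(\epsilon)$ denotes $I(p_{\theta+\epsilon v}\|p_\theta)$. One finds I(0) = I'(0) = 0
(the first because log 1 = 0, the second because the
normalization of $p_\theta$ does not depend on $\theta$) and

    I''(0) = v^\mu \left[ \int d^4x \, p_\theta(x) \left( \frac{1}{p_\theta(x)} \frac{\partial p_\theta(x)}{\partial \theta^\mu} \right) \left( \frac{1}{p_\theta(x)} \frac{\partial p_\theta(x)}{\partial \theta^\nu} \right) \right] v^\nu.    (10)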



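We can now identify the Fisher information metric [4] on
a manifold of probability distributions as

    g_{\mu\nu} = \int_X d^4x \, \frac{1}{p_\theta(x)} \frac{\partial p_\theta(x)}{\partial \theta^\mu} \frac{\partial p_\theta(x)}{\partial \theta^\nu}.    (11)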



It has been shown that this matrix is a metric on a
manifold of probability distributions, see e.g. [9].
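
The link between the second-order expansion of the Kullback number and the Fisher information metric can be illustrated numerically. The sketch below is only a one-dimensional toy version under stated assumptions: the family is the Gaussian family with coordinates theta = (mu, sigma), the integrals over d^4x are replaced by a one-dimensional midpoint rule, and the derivatives in eq. (11) are taken by central differences. It then compares the Kullback number between p_{theta + eps v} and p_theta with (1/2) eps^2 g_{mu nu} v^mu v^nu. All names and numerical parameters are hypothetical.

```python
import math

def pdf(x, theta):
    """Gaussian density with theta = (mu, sigma)."""
    mu, sigma = theta
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(func, lower=-30.0, upper=30.0, steps=10000):
    """Midpoint rule on a fixed window (assumed wide enough for the example)."""
    h = (upper - lower) / steps
    return sum(func(lower + (k + 0.5) * h) for k in range(steps)) * h

def fisher_metric(theta, delta=1e-5):
    """g_{mu nu} = int dx (1/p) d_mu p d_nu p, one-dimensional analogue of eq. (11)."""
    def dp(x, index):
        up, down = list(theta), list(theta)
        up[index] += delta
        down[index] -= delta
        return (pdf(x, up) - pdf(x, down)) / (2.0 * delta)
    n = len(theta)
    return [[integrate(lambda x: dp(x, a) * dp(x, b) / pdf(x, theta))
             for b in range(n)] for a in range(n)]

def kullback_number(theta_q, theta_p):
    """I(q||p) = int dx q log(q/p), one-dimensional analogue of eq. (7)."""
    def integrand(x):
        q = pdf(x, theta_q)
        return q * math.log(q / pdf(x, theta_p))
    return integrate(integrand)

theta = (0.0, 1.5)  # hypothetical base point (mu, sigma)
v = (1.0, 0.5)      # hypothetical direction
eps = 1e-2

metric = fisher_metric(theta)
quadratic = 0.5 * eps ** 2 * sum(metric[a][b] * v[a] * v[b]
                                 for a in range(2) for b in range(2))
shifted = (theta[0] + eps * v[0], theta[1] + eps * v[1])
exact = kullback_number(shifted, theta)
print(metric, quadratic, exact)
```

For this family the metric is known in closed form, diag(1/sigma^2, 2/sigma^2), so the numerical matrix can be checked directly; the remaining difference between the two printed numbers is the O(eps^3) term of the expansion.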

In [1] we have shown that using the concept of relative
entropy, one can introduce a concept of a distance,
equivalent to the Kullback-Leibler distance, on statistical 
spaces. 

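Definition 4: Distance on statistical spaces between
two "points" $p_{\theta^\mu}(x^\mu)$ and $q_{\theta'^\mu}(x^\mu)$:

    I(q_{\theta'^\mu}(x^\mu)\|p_{\theta^\mu}(x^\mu)) = \int d^4x \, q_{\theta'^\mu}(x^\mu) \log \frac{q_{\theta'^\mu}(x^\mu)}{p_{\theta^\mu}(x^\mu)}.    (12)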



The metric on the manifold of distributions is given
locally by

    g_{\mu\nu} = \int_X d^4x \, p_\theta(x) \left( \frac{1}{p_\theta(x)} \frac{\partial p_\theta(x)}{\partial \theta^\mu} \right) \left( \frac{1}{p_\theta(x)} \frac{\partial p_\theta(x)}{\partial \theta^\nu} \right)    (13)

and corresponds to the Fisher information matrix. The
distance between two points $A^\mu$ and $B^\nu$ on the manifold
is given by $d(A^\mu, B^\nu) = \sqrt{g_{\mu\nu} A^\mu B^\nu}$. It was also shown
in [1] that a Lorentzian metric can be generated in certain
cases of physical relevance.

4. MUTUAL INFORMATION ON STATISTICAL 
SPACES 

We follow the presentation of differential entropy proposed
in chapter 9 of [3]. The definitions (1) to (4) extend
directly to the case of statistical spaces.

It follows that the properties of the differential entropy,
relative entropy and mutual information can be carried
over to our framework. In particular, the following relevant
theorems still hold, since their proofs are based on
definitions (1) to (4) or on Jensen's inequality [3].

Theorem 1:

    D(f\|g) \geq 0,    (14)

with equality if and only if f = g almost everywhere.

Theorem 2:
Chain rule for differential entropy

    h(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} h(X_i \mid X_1, X_2, \ldots, X_{i-1}).    (15)

In this case also a corollary of this theorem leads to:
Theorem 3:

    h(X_1, X_2, \ldots, X_n) \leq \sum_{i=1}^{n} h(X_i),    (16)

with equality if and only if $X_1, X_2, \ldots, X_n$ are independent.

This leads directly to Hadamard's inequality [3].
Together with the previous theorems, it shows that a number
of determinant inequalities can be derived from
information-theoretic inequalities. They can be found in
chapter 16 of [3].
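
In the classical setting, the step from Theorem 3 to Hadamard's inequality can be made concrete with a multivariate Gaussian, whose differential entropy is (1/2) log((2 pi e)^n det K) for covariance matrix K. Inserting this into (16) gives det K <= K_11 K_22 ... K_nn. The sketch below checks this for one hypothetical 3 x 3 covariance matrix; the matrix, the function names and the elimination routine are assumptions of this example, not material from [3].

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def joint_entropy(K):
    """h(X_1, ..., X_n) of a Gaussian vector with covariance K, in nats."""
    n = len(K)
    return 0.5 * math.log((2.0 * math.pi * math.e) ** n * determinant(K))

def sum_of_marginal_entropies(K):
    """Sum of h(X_i) for Gaussian marginals with variances K_ii, in nats."""
    return sum(0.5 * math.log(2.0 * math.pi * math.e * K[i][i]) for i in range(len(K)))

# Hypothetical positive-definite covariance matrix.
K = [[2.0, 0.6, 0.2],
     [0.6, 1.0, 0.3],
     [0.2, 0.3, 1.5]]

joint = joint_entropy(K)
marginals = sum_of_marginal_entropies(K)
print(joint, marginals)                       # joint <= marginals, as in (16)
print(determinant(K), K[0][0] * K[1][1] * K[2][2])  # Hadamard: det K <= product of diagonal
```

Exponentiating twice the entropy gap recovers the ratio det K / (K_11 K_22 K_33), so the entropy inequality and the determinant inequality are the same statement for Gaussian vectors.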

It seems that most of the concepts valid for differential
entropy in the usual formalism can be applied
straightforwardly to statistical spaces. This is mostly right. There
is however a distinction when it comes to the relation
of differential entropy to discrete entropy. In the usual
formalism, the differential entropy of a discrete random
variable can be considered to be minus infinity. This agrees with
the idea that the volume of the support set of a discrete
random variable is zero [3].

We are now in a framework where this assumption is 
no longer valid since the existence of a minimal length 
forbids a zero support set. 

A consequence is that it is no longer established that
the entropy of an n-bit quantization of a continuous random
variable X is approximately h(X) + n. Indeed, the
Riemann integrability of the density function apparently
no longer holds. A full investigation of this question is
still required.
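
For comparison, the classical statement that is in question here can be reproduced numerically. The sketch below concerns only the usual formalism, where bins may shrink without limit: it quantizes a standard Gaussian variable into bins of width 2^-n, computes the discrete entropy of the bin probabilities in bits, and compares it with h(X) + n, where h(X) = (1/2) log2(2 pi e). The function names and the truncation window are assumptions of this example, and nothing here settles what happens on statistical spaces.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of a standard Gaussian."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def quantized_entropy_bits(n, half_width=10.0):
    """Discrete entropy (bits) of a standard Gaussian quantized with bin width 2**-n."""
    delta = 2.0 ** (-n)
    bins = int(round(2.0 * half_width / delta))
    entropy = 0.0
    for k in range(bins):
        left = -half_width + k * delta
        p = normal_cdf(left + delta) - normal_cdf(left)
        if p > 0.0:  # skip empty bins and rounding noise in the far tails
            entropy -= p * math.log2(p)
    return entropy

h_bits = 0.5 * math.log2(2.0 * math.pi * math.e)  # differential entropy of N(0, 1) in bits
for n in (3, 6):
    print(n, quantized_entropy_bits(n), h_bits + n)
```

The agreement improves as the bin width decreases, which is exactly the limiting procedure that a minimal length would prevent.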

5. DIFFERENTIAL ENTROPY AND DYNAMICS 
OF UNCERTAINTY 

This concluding section is devoted to a brief discus- 
sion of the link between differential entropy, dynamics 
and the inequalities asserting the degree of uncertainty 
of the concept of information. This link is not new. In- 
deed, the definition of Shannon entropy mixes both un- 
certainty and information measure. Differential entropy 
can then be seen as an assessment of the uncertainty on 
the knowledge of the information contents of a system. 

This paper is the third in a series, which first defined the
concept of metric on a statistical space-time [1] and then
introduced the concept of dynamics in the Fisher information
metric [2]. A new contribution of this paper is
to show that the definition of the metric on a statistical
space-time makes it possible to define the same expression
for the mutual information I(X;Y) as given in eq. (3). As long
as it is not required to specify the density distributions,
there is a straightforward transcription from the usual
macroscopic formalism. This is why the main definitions
taken from [3] are valid in both formalisms.

In the classical case, these definitions make it possible to
prove a series of inequalities that are mostly based upon the
Riemann integrability of the integral of f(x) log f(x), where
f(x) is the probability density function. Riemann
integrability implies a well-defined concept of limit. In
statistical spaces we no longer have the simplifying assumption
that the limit of the support set of the random variable
is zero. As already mentioned, this Riemann integrability
is among the problems to be investigated.

An even more interesting question is the case of quantum
states. We have borrowed the title of this section
from a recent paper [5] where the dynamics of uncertainty
for quantum states is thoroughly investigated.

It is well-known that the quantum equivalent of a bit
is a qubit, whose states are vectors in a two-dimensional
Hilbert space. It is also well-known that quantum systems
in a pure state have vanishing von Neumann entropy. This means
that complete information on the state of the system is
presumed. On the other hand, the Shannon entropy is
associated with a probability distribution. The adequacy of
either the von Neumann or the Shannon entropy for quantum
states has been the topic of several studies, see for
instance [7]. We fully agree with the option
adopted in [5] to rely on differential entropy to
investigate the density evolution.
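
The contrast between the two entropies can be seen on a single qubit. The sketch below is a minimal illustration in standard quantum mechanics, not in the formalism of this paper: for a pure state a|0> + b|1> the von Neumann entropy -Tr(rho log2 rho) vanishes, while the Shannon entropy of the measurement probabilities |a|^2 and |b|^2 in the computational basis does not. The function names and the closed-form eigenvalues of a 2 x 2 density matrix with unit trace are assumptions of this example.

```python
import math

def density_matrix(a, b):
    """rho = |psi><psi| for the qubit state a|0> + b|1> (assumed normalised)."""
    a, b = complex(a), complex(b)
    return [[a * a.conjugate(), a * b.conjugate()],
            [b * a.conjugate(), b * b.conjugate()]]

def von_neumann_entropy(rho):
    """-Tr(rho log2 rho) in bits for a 2x2 density matrix with unit trace."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    eigenvalues = (0.5 * (1.0 + disc), 0.5 * (1.0 - disc))
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 1e-15)

def measurement_entropy(a, b):
    """Shannon entropy (bits) of the outcome probabilities |a|^2 and |b|^2."""
    probs = (abs(a) ** 2, abs(b) ** 2)
    return -sum(p * math.log2(p) for p in probs if p > 0.0)

amp = 1.0 / math.sqrt(2.0)
pure = density_matrix(amp, amp)    # equal superposition, a pure state
mixed = [[0.5, 0.0], [0.0, 0.5]]   # maximally mixed state

print(von_neumann_entropy(pure))      # vanishes for a pure state
print(von_neumann_entropy(mixed))     # one bit for the maximally mixed state
print(measurement_entropy(amp, amp))  # one bit of outcome uncertainty
```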

In that paper [5], Garbaczewski pays special attention
to various entropic inequalities, including those briefly
mentioned above. In an application example, he extends
the basic features of his formalism to a so-called
information dynamics due to the Schrödinger model
of evolution of wave packets. He also points out that
Shannon differential entropy has been used for years in
the formulation of entropic versions of Heisenberg-type
indeterminacy relations. Our reference to [5] and
our brief overview of its goals are
motivated by the fact that, although we do not specialize
our formalism to quantum states, this is an obvious
goal of our approach. Indeed, a big advantage is that
we can use Gaussian distributions as density functions,
which are realistic models for quantum states as used
in quantum computing. Moreover, we have shown in [2]
how to easily introduce dynamics into the formalism.
This is achieved by methods known to every particle
physicist and consists in imposing symmetries. We are
thus apparently well equipped to investigate entropic
inequalities for quantum states. This is part of our
agenda of problems to investigate. It must be noted that,
until this investigation is completed, caution must be
taken not to presume any result within our formalism.
Indeed, we have no simplifying hypothesis such as the
vanishing of the support set of the density functions,
and thus can only conjecture that we will be able to prove
such inequalities.
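
As a point of reference from the standard formalism only, the best-known entropic version of a Heisenberg-type relation reads h(x) + h(p) >= ln(pi e hbar), and Gaussian wave packets saturate it. The short worked equation below uses the closed-form differential entropy of a Gaussian density; the symbols sigma_x and sigma_p for the position and momentum spreads are notation introduced for this example. Whether an analogous bound holds on statistical spaces is precisely what remains to be investigated.

```latex
% Standard quantum mechanics, natural logarithms; \sigma_x and \sigma_p are
% the standard deviations of the Gaussian densities |\psi(x)|^2 and |\tilde\psi(p)|^2.
\begin{align}
  h(x) &= \tfrac{1}{2}\ln\!\left(2\pi e\,\sigma_x^{2}\right), &
  h(p) &= \tfrac{1}{2}\ln\!\left(2\pi e\,\sigma_p^{2}\right), \\
  h(x) + h(p) &= \ln\!\left(2\pi e\,\sigma_x\sigma_p\right)
               = \ln\!\left(\pi e\,\hbar\right)
  & &\text{for } \sigma_x\sigma_p = \hbar/2 .
\end{align}
```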



* The work of X.C. was supported in part by the US De- 
partment of Energy under Grant No. DE-FG02-97ER- 
41036; Electronic address: calmet@physics.unc.edu 

[1] J. Calmet and X. Calmet, "Metric on a Statistical Space-Time,"
WSEAS Trans. on Circuits and Systems, issue 10, vol. 3,
pp. 2267-2271, Dec. 2004.

[2] X. Calmet and J. Calmet, "Dynamics of the Fisher Information
Metric," to appear in Phys. Rev. E, 2005.

[3] T. M. Cover and J. A. Thomas, "Elements of Information
Theory," Wiley Series in Telecommunications, 1991.

[4] R. A. Fisher, "Statistical Methods and Scientific Inference,"
2nd edn., Oliver and Boyd, London, 1959.

[5] P. Garbaczewski, "Differential Entropy and Dynamics
of Uncertainty," arXiv:quant-ph/0408192v3, 17 January 2005.

[6] S. Kullback, "Information Theory and Statistics," John
Wiley, New York, 1959.

[7] P. G. L. Mana, "Consistency of the Shannon Entropy in
Quantum Experiments," Phys. Rev. A 69, 062108, 2004.

[8] C. C. Rodriguez, "The Metrics Induced by the Kullback
Number," in J. Skilling (ed.), Maximum Entropy and
Bayesian Methods, pages 415-422, 1989.

[9] C. C. Rodriguez, "Are We Cruising a Hypothesis
Space?," in the Proceedings of Maximum Entropy and
Bayesian Methods 1998, physics/9808009.

[10] C. E. Shannon, "A Mathematical Theory of Communication,"
Bell System Technical Journal 27:379-423, 623-656,
July and October 1948.



* Electronic address: calmet@ira.uka.de 



